Albeit the Dice loss is one of the dominant loss functions in medical image segmentation, most research omits a detailed study of its derivative, i.e., the real motor of the optimization when using gradient descent. In this paper, we highlight the peculiar action of the Dice loss in the presence of missing or empty labels. First, we formulate a theoretical basis that gives a general description of the Dice loss and its derivative. It turns out that the choice of the reduction dimensions $\Phi$ and the smoothing term $\epsilon$ is non-trivial and greatly influences its behavior. We find and propose heuristic combinations of $\Phi$ and $\epsilon$ that work in a segmentation setting with either missing or empty labels. Second, we empirically validate these findings in a binary and a multiclass segmentation setting using two publicly available datasets. We confirm that the choice of $\Phi$ and $\epsilon$ is indeed pivotal. With $\Phi$ chosen such that the reductions happen over a single batch (and class) element and with a negligible $\epsilon$, the Dice loss deals with missing labels naturally and performs similarly to recent adaptations specific for missing labels. With $\Phi$ chosen such that the reductions happen over multiple batch elements, or with a heuristic value for $\epsilon$, the Dice loss handles empty labels correctly. We believe that this work highlights some essential perspectives and hope that it encourages researchers to better describe their exact implementation of the Dice loss in future work.
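As a concrete illustration of how $\Phi$ and $\epsilon$ interact, here is a minimal PyTorch sketch of a soft Dice loss (not the authors' reference implementation; names and defaults are illustrative) in which the reduction dimensions and the smoothing term are explicit parameters:

```python
import torch

def dice_loss(probs, target, reduce_batch=False, eps=1e-6):
    """Soft Dice loss with explicit reduction dimensions and smoothing term.

    probs, target: tensors of shape (B, C, ...) holding predicted
    probabilities and one-hot labels. reduce_batch=False reduces over the
    spatial dimensions of each (sample, class) pair separately;
    reduce_batch=True additionally sums over the batch dimension before
    forming the quotient, which changes how empty labels are treated.
    """
    spatial = tuple(range(2, probs.ndim))            # candidate dims for Phi
    dims = (0,) + spatial if reduce_batch else spatial
    inter = (probs * target).sum(dims)
    denom = probs.sum(dims) + target.sum(dims)
    dice = (2 * inter + eps) / (denom + eps)
    return 1 - dice.mean()
```

Note how, in this formulation, an empty target reduces the per-sample Dice to `eps / (probs.sum() + eps)`: the value of `eps` alone then decides whether predicting an empty mask is rewarded, which is exactly the kind of implementation sensitivity the paper dissects.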
Convolutional Neural Networks (CNNs) with U-shaped architectures have dominated medical image segmentation, which is crucial for various clinical purposes. However, the inherent locality of convolution makes CNNs fail to fully exploit global context, essential for better recognition of some structures, e.g., brain lesions. Transformers have recently proven promising performance on vision tasks, including semantic segmentation, mainly due to their capability of modeling long-range dependencies. Nevertheless, the quadratic complexity of attention makes existing Transformer-based models use self-attention layers only after somehow reducing the image resolution, which limits the ability to capture global contexts present at higher resolutions. Therefore, this work introduces a family of models, dubbed Factorizer, which leverages the power of low-rank matrix factorization for constructing an end-to-end segmentation model. Specifically, we propose a linearly scalable approach to context modeling, formulating Nonnegative Matrix Factorization (NMF) as a differentiable layer integrated into a U-shaped architecture. The shifted window technique is also utilized in combination with NMF to effectively aggregate local information. Factorizers compete favorably with CNNs and Transformers in terms of accuracy, scalability, and interpretability, achieving state-of-the-art results on the BraTS dataset for brain tumor segmentation and ISLES'22 dataset for stroke lesion segmentation. Highly meaningful NMF components give an additional interpretability advantage to Factorizers over CNNs and Transformers. Moreover, our ablation studies reveal a distinctive feature of Factorizers that enables a significant speed-up in inference for a trained Factorizer without any extra steps and without sacrificing much accuracy. The code and models are publicly available at https://github.com/pashtari/factorizer.
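To make the core idea concrete, the following is a minimal sketch of NMF as a differentiable layer via unrolled multiplicative updates; this is an assumption-laden toy, not the authors' implementation (see the linked repository for that):

```python
import torch

def nmf_layer(x, rank=8, n_iter=5, eps=1e-8):
    """Differentiable NMF via unrolled multiplicative updates.

    x: nonnegative tensor of shape (B, N, D), e.g. flattened local windows
    of (ReLU-activated) feature maps. Returns the rank-`rank` reconstruction
    U @ V, which serves as a linearly scalable context-aggregation output.
    """
    B, N, D = x.shape
    u = x.new_empty(B, N, rank).uniform_(0, 1)
    v = x.new_empty(B, rank, D).uniform_(0, 1)
    for _ in range(n_iter):  # standard Frobenius-norm NMF updates
        v = v * (u.transpose(1, 2) @ x) / (u.transpose(1, 2) @ u @ v + eps)
        u = u * (x @ v.transpose(1, 2)) / (u @ (v @ v.transpose(1, 2)) + eps)
    return u @ v
```

Because each update is a composition of matrix products and divisions, gradients flow through the unrolled iterations, letting the factorization be trained end-to-end inside a U-shaped network.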
Machine learning driven medical image segmentation has become standard in medical image analysis. However, deep learning models are prone to overconfident predictions. This has led to a renewed focus on calibrated predictions in the medical imaging and broader machine learning communities. Calibrated predictions are estimates of the probability of a label that correspond to the true expected value of the label given the confidence. Such calibrated predictions have utility in a range of medical imaging applications, including surgical planning under uncertainty and active learning systems. At the same time, it is often an accurate volume measurement that is of real practical importance for many medical applications. This work investigates the relationship between model calibration and volume estimation. We demonstrate, both mathematically and empirically, that if the predictor is calibrated per image, we can obtain the correct volume by taking an expectation over the per-pixel/voxel probability scores. Furthermore, we show that convex combinations of calibrated classifiers preserve volume estimation but do not preserve calibration. Therefore, we conclude that having a calibrated predictor is a sufficient, but not necessary, condition for obtaining an unbiased estimate of the volume. We validate our findings on the task of glioma volume estimation on BraTS 2018 and ischemic stroke lesion volume estimation on the ISLES 2018 dataset, using a collection of 18 different (calibrated) training strategies.
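The paper's central identity is easy to state in code. Below is a hypothetical sketch of the soft volume estimator, contrasted with naive thresholding (variable names are illustrative):

```python
import numpy as np

def expected_volume(probs, voxel_volume_ml=1.0):
    """Soft volume estimate: the sum of per-voxel foreground probabilities.

    If the predictor is calibrated per image, this expectation is an
    unbiased estimate of the true volume, whereas counting voxels after
    thresholding at 0.5 generally is not.
    """
    return probs.sum() * voxel_volume_ml

# Example on dummy probabilities:
probs = np.array([0.9, 0.6, 0.4, 0.1])
hard = (probs > 0.5).sum()        # 2 voxels after thresholding
soft = expected_volume(probs)     # 2.0 voxels in expectation
```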
Recent research on COVID-19 suggests that CT imaging provides useful information to assess disease progression and assist diagnosis, in addition to helping understand the disease. There is an increasing number of studies proposing to use deep learning for fast and accurate quantification of COVID-19 from chest CT scans. The main tasks of interest are the automatic segmentation of lung and lung lesions in chest CT scans of confirmed or suspected COVID-19 patients. In this study, we compare twelve deep learning algorithms, including open-source and in-house developed ones, using a multi-center dataset. The results show that ensembling different methods can boost the overall test set performance for lung segmentation, binary lesion segmentation, and multiclass lesion segmentation, resulting in mean Dice scores of 0.982, 0.724, and 0.469, respectively. The resulting binary lesions were segmented with a mean absolute volume error of 91.3 ml. In general, the task of distinguishing different lesion types was more difficult, with a mean absolute volume difference of 152 ml and mean Dice scores of 0.369 and 0.523 for consolidation and ground glass opacity, respectively. All methods performed binary lesion segmentation with an average volume error better than that of visual assessment by human raters, suggesting these methods are mature enough for large-scale evaluation for use in clinical practice.
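The abstract does not specify how the methods were merged; a simple, hypothetical probability-averaging ensemble of the kind commonly used in such comparisons could look like this:

```python
import numpy as np

def ensemble_segmentation(prob_maps):
    """Average the per-voxel class probabilities of several models and take
    the argmax — one straightforward way to combine segmentation methods.

    prob_maps: list of arrays of shape (C, H, W, D) with softmax outputs,
    one per model. Returns a (H, W, D) label map.
    """
    mean_probs = np.mean(np.stack(prob_maps), axis=0)
    return mean_probs.argmax(axis=0)
```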
Segmentation of lidar data is a task that provides rich, point-wise information about the environment of robots or autonomous vehicles. The currently best-performing neural networks for lidar segmentation are fine-tuned to specific datasets. Switching the lidar sensor without retraining on a big set of annotated data from the new sensor creates a domain shift, which causes the network performance to drop drastically. In this work we propose a new method for lidar domain adaptation, in which we use annotated panoptic lidar datasets and recreate the recorded scenes in the structure of a different lidar sensor. We narrow the domain gap to the target data by recreating panoptic data from one domain in another and mixing the generated data with parts of (pseudo) labeled target domain data. Our method improves the nuScenes to SemanticKITTI unsupervised domain adaptation performance by 15.2 mean Intersection over Union points (mIoU) and by 48.3 mIoU in our semi-supervised approach. We demonstrate a similar improvement for the SemanticKITTI to nuScenes domain adaptation, by 21.8 mIoU and 51.5 mIoU, respectively. We compare our method with two state-of-the-art approaches for semantic lidar segmentation domain adaptation, showing a significant improvement for unsupervised and semi-supervised domain adaptation. Furthermore, we successfully apply our proposed method to two entirely unlabeled datasets of two state-of-the-art lidar sensors, Velodyne Alpha Prime and InnovizTwo, and train well-performing semantic segmentation networks for both.
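The paper's scene-recreation pipeline is more involved than can be shown here, but a crude, hypothetical sketch of the underlying idea — resampling a point cloud onto the beam layout of a different sensor — might look like this (function name, tolerance, and approach are all assumptions, not the authors' method):

```python
import numpy as np

def resample_to_target_beams(points, target_elevations_deg, tol_deg=0.2):
    """Keep only points whose vertical angle falls close to one of the
    target sensor's beam elevations, roughly approximating how the same
    scene would be sampled by a lidar with a different beam layout.

    points: (N, 3) array of x, y, z coordinates in the sensor frame.
    """
    r_xy = np.linalg.norm(points[:, :2], axis=1)
    elev = np.degrees(np.arctan2(points[:, 2], r_xy))
    target = np.asarray(target_elevations_deg)
    dist = np.abs(elev[:, None] - target[None, :]).min(axis=1)
    return points[dist < tol_deg]
```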
Explainable AI (XAI) is slowly becoming a key component for many AI applications. Rule-based and modified backpropagation XAI approaches, however, often face challenges when being applied to modern model architectures including innovative layer building blocks, which is caused by two reasons. Firstly, the high flexibility of rule-based XAI methods leads to numerous potential parameterizations. Secondly, many XAI methods break the implementation-invariance axiom because they struggle with certain model components, e.g., BatchNorm layers. The latter can be addressed with model canonization, which is the process of re-structuring the model to disregard problematic components without changing the underlying function. While model canonization is straightforward for simple architectures (e.g., VGG, ResNet), it can be challenging for more complex and highly interconnected models (e.g., DenseNet). Moreover, there is only little quantifiable evidence that model canonization is beneficial for XAI. In this work, we propose canonizations for currently relevant model blocks applicable to popular deep neural network architectures, including VGG, ResNet, EfficientNet, DenseNet, as well as Relation Networks. We further suggest an XAI evaluation framework with which we quantify and compare the effects of model canonization for various XAI methods in image classification tasks on the Pascal-VOC and ILSVRC2017 datasets, as well as for Visual Question Answering using CLEVR-XAI. Moreover, addressing the former issue outlined above, we demonstrate how our evaluation framework can be applied to perform hyperparameter search for XAI methods to optimize the quality of explanations.
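As an example of canonization, the textbook BatchNorm-folding step (the function name and structure here are illustrative, not the paper's code) merges a BatchNorm layer into the preceding convolution without changing the network's function:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold a BatchNorm layer (in eval mode, i.e. using running statistics)
    into the preceding convolution. The fused convolution computes the same
    function, but the BatchNorm component that troubles many modified
    backpropagation rules disappears from the graph."""
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      conv.stride, conv.padding, conv.dilation, conv.groups,
                      bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
    bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    fused.bias.copy_((bias - bn.running_mean) * scale + bn.bias)
    return fused
```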
Autonomous vehicles currently suffer from a time-inefficient driving style caused by uncertainty about human behavior in traffic interactions. Accurate and reliable prediction models enabling more efficient trajectory planning could make autonomous vehicles more assertive in such interactions. However, the evaluation of such models is commonly overly simplistic, ignoring the asymmetric importance of prediction errors and the heterogeneity of the datasets used for testing. We examine the potential of recasting interactions between vehicles as gap acceptance scenarios and evaluating models in this structured environment. To that end, we develop a framework facilitating the evaluation of any model, by any metric, and in any scenario. We then apply this framework to state-of-the-art prediction models, which all show themselves to be unreliable in the most safety-critical situations.
Geospatial Information Systems are used by researchers and Humanitarian Assistance and Disaster Response (HADR) practitioners to support a wide variety of important applications. However, collaboration between these actors is difficult due to the heterogeneous nature of geospatial data modalities (e.g., multi-spectral images of various resolutions, time series, weather data) and the diversity of tasks (e.g., regressing human activity indicators or detecting forest fires). In this work, we present a roadmap towards the construction of a general-purpose neural architecture (GPNA) with a geospatial inductive bias, pre-trained on large amounts of unlabelled earth observation data in a self-supervised manner. We envision how such a model may facilitate cooperation between members of the community. We show preliminary results on the first step of the roadmap, where we instantiate an architecture that can process a wide variety of geospatial data modalities and demonstrate that it can achieve competitive performance with domain-specific architectures on tasks relating to the U.N.'s Sustainable Development Goals.
The physics of a closed quantum mechanical system is governed by its Hamiltonian. However, in most practical situations, this Hamiltonian is not precisely known, and ultimately all one has are data obtained from measurements on the system. In this work, we learn families of interacting many-body Hamiltonians from dynamical data by bringing together gradient-based optimization techniques from machine learning with efficient quantum state representations in terms of tensor networks. Our approach is highly practical, experimentally friendly, and intrinsically scalable to system sizes of more than 100 spins. In particular, we demonstrate on synthetic data that the algorithm works even when one is restricted to a simple initial state, a small number of single-qubit observables, and time evolution up to relatively short times. For the concrete example of the one-dimensional Heisenberg model, our algorithm exhibits an error constant in the system size and scaling as the inverse square root of the dataset size.
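For intuition, here is a toy, hypothetical version of the approach for a 4-spin Heisenberg chain, using exact state vectors and PyTorch autograd instead of the paper's scalable tensor-network machinery: per-bond couplings are fitted by gradient descent to single-qubit measurement data from short-time evolution.

```python
import torch

# Single-qubit operators.
I2 = torch.eye(2, dtype=torch.complex128)
X = torch.tensor([[0, 1], [1, 0]], dtype=torch.complex128)
Y = torch.tensor([[0, -1j], [1j, 0]], dtype=torch.complex128)
Z = torch.tensor([[1, 0], [0, -1]], dtype=torch.complex128)

def kron_all(ops):
    out = ops[0]
    for op in ops[1:]:
        out = torch.kron(out, op)
    return out

def heisenberg(n, J):
    """Nearest-neighbour Heisenberg chain with per-bond couplings J."""
    H = torch.zeros(2 ** n, 2 ** n, dtype=torch.complex128)
    for i in range(n - 1):
        for P in (X, Y, Z):
            ops = [I2] * n
            ops[i] = ops[i + 1] = P
            H = H + J[i] * kron_all(ops)
    return H

n = 4
times = torch.linspace(0.1, 1.0, 10)
psi0 = torch.zeros(2 ** n, dtype=torch.complex128)
psi0[0] = 1.0                                   # simple product initial state
obs = kron_all([Z] + [I2] * (n - 1))            # one single-qubit observable

def signals(J):
    """Expectation values <Z_1>(t) under evolution with H(J)."""
    H = heisenberg(n, J)
    out = []
    for t in times:
        psi_t = torch.linalg.matrix_exp(-1j * t * H) @ psi0
        out.append((psi_t.conj() @ obs @ psi_t).real)
    return torch.stack(out)

data = signals(torch.tensor([1.0, 0.7, 1.3]))   # synthetic "measurements"
J = torch.ones(n - 1, requires_grad=True)       # unknown couplings
opt = torch.optim.Adam([J], lr=0.05)
for _ in range(500):                            # fit couplings to the data
    opt.zero_grad()
    loss = ((signals(J) - data) ** 2).mean()
    loss.backward()
    opt.step()
```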
Neural networks trained with stochastic gradient descent (SGD) from different random initializations usually end up functionally very similar, raising the question of whether there are meaningful differences between distinct SGD solutions. Entezari et al. recently conjectured that, despite the different initializations, the solutions found by SGD lie in the same loss valley once the permutation invariance of neural networks is taken into account. Concretely, they hypothesize that any two solutions found by SGD can be permuted such that the linear interpolation between their parameters forms a path without a significant increase in loss. Here, we use a simple yet powerful algorithm to find such permutations, which allows us to obtain direct empirical evidence that the hypothesis is true for fully connected networks. Strikingly, we find that two networks already lie in the same loss valley at the time of initialization, and that averaging their random, but suitably permuted, initializations performs significantly above chance. In contrast, for convolutional architectures, our evidence suggests that the hypothesis does not hold. Particularly in the large learning rate regime, SGD appears to discover diverse modes.
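A simple weight-matching heuristic in the spirit of such permutation-finding algorithms (this sketch is illustrative, not the authors' exact procedure) aligns the hidden units of one MLP layer by solving a linear assignment problem before interpolating:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def permute_hidden_units(w1_a, w1_b, w2_b):
    """Align the hidden units of network B with those of network A for one
    hidden layer of an MLP: find the permutation maximizing the similarity
    of incoming weights, then apply it to the outgoing weights as well.

    w1_a, w1_b: (hidden, in) incoming weights; w2_b: (out, hidden) outgoing.
    """
    cost = w1_a @ w1_b.T                       # pairwise unit similarity
    _, perm = linear_sum_assignment(-cost)     # maximize total similarity
    return w1_b[perm], w2_b[:, perm]

# After matching, the "midpoint network" is a plain parameter average:
#   w1_mid = 0.5 * (w1_a + w1_b_permuted)
```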